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ABSTRACT 

We quantify and discuss the footprints of neutral hydrogen in the intergalactic medium (IGM) on the spectra 
of high-redshift (z ^ 6) sources, using mock spectra generated from hydrodynamical simulations of the IGM. 
We show that it should be possible to extract relevant parameters, including the mean neutral fraction in the IGM, 
and the radius of the local cosmological Stromgren region, from the flux distribution in the observed spectra 
of distant sources. We focus on quasars, but a similar analysis is applicable to galaxies and gamma ray burst 
(GRB) afterglows. We explicitly include uncertainties in the spectral shape of the assumed source template near 
the Lyman a line. Our results suggest that a mean neutral hydrogen fraction, xh of unity can be statistically 
distinguished from xh ~ 10"^, by combining the spectra of tens of bright (M w -27) quasars. Alternatively, 
the same distinction can be achieved using the spectra of several hundred sources that are 100 times fainter. 
Furthermore, if the radius of the Stromgren sphere can be independently constrained to within ^ 10%, this 
distinction can be achieved using a single source. The information derived from such spectra will help in settling 
the current debate as to what extent the universe was reionized at redshifts near z ~ 6. 



1. INTRODUCTION 

The epoch of cosmological reionization is a significant mile- 
stone in the history of structure formation. Despite recent ob- 
servational break-throughs, the details of the reionization his- 
tory remain poorly determined. The Sloan Digital Sky Survey 
(SDSS) has detected large regions with no observable flux in the 
spectra of several z ~ 6 quasars (Becker et al. 2001; Fan et al. 
2003; White et al. 2003). The presence of these Gunn-Peterson 
(GP) troughs set a lower limit on the volume weighted hydro- 
gen neutral fraction of xh > 10"^ (Fan et al. 2002). This strong 
limit implies a rapid evolution in the ionizing background from 
z = 5.5 to z - 6 (Cen & McDonald 2002; Fan et al. 2002; Lidz 
et al. 2002; Pentericci et al. 2002), and suggests that we are 
witnessing the end of the reionization epoch, with the IGM 
becoming close to fully neutral at z ~ 7 (but see Songaila & 
Cowie 2002 for a different conclusion). On the other hand, re- 
cent results from the Wilkinson Microwave Anisotropy Probe 
(WMAP) have uncovered evidence for a large optical depth 
to electron scattering, ~ 0.17 ±0.04 (Bennett 2003) in the 
cosmic microwave background anisotropics. Assuming a step- 
function model for the reionization history, this result would in- 
dicate that reionization began at z ~ 17 ± 4 (Kogut et al. 2003; 
Spergel et al. 2003). Although physically motivated "double" 
reionization scenarios, proposed prior to the WMAP result (Cen 
2003a; Wyithe & Loeb 2003) are consistent with the combined 
observations, the details of the reionization process remain far 
from being clear 

Numerous recent theoretical works have addressed resolu- 
tions of the apparent discrepancy (see, e.g., Haiman 2003 for 
a review). Successful models incorporate various feedback ef- 
fects, such as that due to metal-enrichment (Cen 2003b; Wyithe 
& Loeb 2003) or associated with the UV radiation produced by 
the early ionizing sources (Haiman & Holder 2003). Alterna- 
tively, the high-redshift "tail" of ionization has been attributed 
to an early population of X-ray producing black holes (Ricotti 



& Ostriker 2003; Madau et al. 2003) or even to something more 
exotic, such as decaying particles (Hansen & Haiman 2003). 
These competing reionization scenarios each predict a different 
evolution of the neutral fraction beyond z ^ 6. It would there- 
fore be quite beneficial to observationally determine the values 
of Xh at the intermediate redshifts of 6 < z < 30. 

As a first step towards this goal, differentiating between a 
neutral and a mostly ionized universe at redshifts just beyond 
z ~ 6, would akeady aid in discriminating between several mod- 
els (Haiman & Holder 2003). To do this however, one re- 
quires a more detailed statistic than the presence or absence 
of a Gunn-Peterson trough. In this paper, we utilize the trans- 
mitted flux distribution of a hypothetical z ^ 6 source near its 
Lya wavelength. While most of the flux on the blue side of 
Lya is simply extinguished for a wide range of neutral frac- 
tions (lO"-' "^^H < 1)^ detectable flux can be transmitted close 
to the line center, at wavelengths corresponding to a local HII 
region around the source (e.g. Cen & Haiman 2000; Madau & 
Rees 2000). The spectrum on the blue side, as well as the flux 
processed by the damping wing of IGM absorption and trans- 
mitted on the red side of the Lya line (Miralda-Escude 1998), 
depends on the hydrogen neutral fraction of the IGM in which 
the source is embedded. As a result, the flux distributions can 
be used as a probe of the neutral fraction in the IGM. 

There are two immediate apparent difficulties with the above 
approach. First, it requires an estimate of the intrinsic spectrum 
of the source. Second, it requires an ab-initio model of the 
density (and velocity) fields surrounding the source, which will 
influence the transmitted flux distribution. Since both of these 
quantities are impossible to predict accurately from first princi- 
ples for any specific source and line of sight, a single spectrum 
is unlikely to provide tight constraints on the neutral fraction. 
However, the hydrogen neutral fraction can still be inferred sta- 
tistically, by studying the spectra of a sample of sources. The 
purpose of this paper is to quantify the accuracy to which xh 
can be determined in a future sample of high redshift sources. 
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taking the above complications into account. We use a hy- 
drodynamical simulation to generate mock spectra of sources 
for different assumed values of the hydrogen neutral fraction of 
the IGM, and quantify the statistical confidence to which these 
transmitted spectra can be distinguished from each other. 

The rest of this paper is organized as follows. In § 2, we dis- 
cuss the basic signatures of neutral hydrogen in the IGM that 
should be imprinted on the spectrum of a background source. 
In § 3, we describe the method we use to produce mock spec- 
tra. In § 4, we present the statistical comparison between the 
various mock spectra, and assess the accuracy with which the 
input neutral fraction can be recovered in each case. In § 5, we 
discuss the relative merits of using different source types (qua- 
sars, galaxies or gamma ray burst afterglows), and some related 
issues. In § 6, we explore the benefits of independently con- 
straining the radius of the Stromgren sphereFinally, we sum- 
marize the implications of this work and offer our conclusions 
in §7. 

We use redshift, z, and the observed wavelength, Aobs, inter- 
changeably throughout this paper as a measure of the distance 
away from the source along the line of sight. These can be 
related by: (1 +z) = AobsAa, where Aq, = 1215.67 A is the rest- 
frame wavelength of the Lya line center. The proper distance 
from a source at redshift Zs to a point at redshift z, is deter- 
mined by r = dz' cj^, where c{dt/dz!) is the line element in 
a given cosmology, and the comoving distance is (1 +z) times 
the proper distance. For reference, in our case a proper distance 
of 6 Mpc corresponds to about AAobs ~ lOOA. 

Throughout this paper, we assume a standard ACDM cos- 
mology, with (rjA,17M,rJi„n,o-8, //()) = (0.71, 0.29, 0.047, 1, 
0.85, 70 km/s/Mpc), consistent with the recent results from 
WMAP (Spergel et al. 2003). Unless stated otherwise, all len- 
gths are quoted in comoving units. 

2. SPECTRAL SIGNATURES OF NEUTRAL HYDROGEN 

In this section, we summarize the spectral signatures of neu- 
tral hydrogen in the IGM. The spectrum emitted by a source at 
redshift Zs, Fq{\), will be modified around the Lya wavelength 
by absorption by neutral hydrogen atoms along the Une of sight, 
so that we observe F(A) = Fo(A/[l -i-z])exp(-r[A,Zs]). The to- 
tal optical depth due to Lya absorption, r, between an observer 
at z = and a source at z = Zs, at an observed wavelength of 
Aobs = Aj(l + Zj), is given by: 

where c{dt/dz) is the line element in a given cosmology, nniz) 
is the hydrogen number density at redshift z, xh(z) is the hy- 
drogen neutral fraction at redshift z, and cr(^) is the Lya ab- 
sorption cross section. Since high-redshift sources sit in their 
own highly ionized Stromgren spheres', the total optical depth 
at a given Aobs can be written as the sum of contributions from 
inside and outside the Stromgren sphere, t = tr + td. The reso- 
nant optical depth, tr, is given by: 

Ts(Aobs)= / dz riHiz) xh{z) a ( (2) 

Jzmi dz Vl+Z/ 
'The assumption of discrete HII regions is invalid if hard spectra dominate 
reionization. In this case, the surface of the HII region can have a width that 
exceeds the separation between the ionizing sources (Oh 2001; Venkatesan et 
al. 2001). There is stUl a proximity region around each source, within which 
they dominate the cosmic background, which could be used in the analysis 
below in place of the Stromgren sphere. 



and the damping wing optical depth for the IGM outside the 
Stromgren sphere can be obtained from 

TD(Aobs) = / """^^^ ^"^^"^ ( ) 

Zend ^ \ ^ / 

where zhii corresponds to the redshift of the edge of the Strom- 
gren sphere, and Zend denotes the redshift by which HI absorp- 
tion is insignificant along the line of sight to the source. For 
a smooth IGM, we would simply have mh(z) = n^.o (1 +z)^, 
and since the fall-off of the cross section is rapid compared to 
the change in x//(z), we can also approximate xh — constant, 
for the smooth IGM in eq. (3). Conversely, the fall-off of 
the cross section is slow compared to the small-scale fluctua- 
tions in the IGM, averaging out their contributions to td- For 
these reasons, in this paper we approximate xh — constant, and 
assume a smooth IGM. The value of Zend was chosen to be 
Zend = ZHn-0.5. As long as we are not looking through an- 
other source's Stromgren sphere close to znn, we find that the 
exact value of Zend is irrelevant because most of the contribu- 
tion to To comes from within Az < 0.5 of zhii- In particular, a 
choice of Zend = znn - 0.25 results in an average change in td of 
only ~ 4% compared to our fiducial model with Zend = znn -0.5 . 
The difference between equations (2) and (3) are: (1) the Um- 
its of integration, (2) the determination of xh, which assumes 
ionization equilibrium with the cosmological background flux 
7bg outside the Stromgren sphere with the approximation of 
Xh ~ constant, and equilibrium with the sum of the background 
and the local source fluxes, Jbg+Js inside the HH region with a 
fluctuating x//(z), and (3) at the relevant wavelengths where t« 
is significant, the integral in equation (2) is dominated by the 
resonant cross-sections, whereas at all wavelengths where flux 
is transmitted, the damping wing dominates equation (3). As 
a result of this last property, tr fluctuates strongly with wave- 
length, reflecting the local density field, whereas the damping 
wing contribution is smooth, since it averages the contributions 
to the damping wing from a relatively wide redshift interval. 

The two optical depths are represented in Figure 1 for the 
case of a bright quasar embedded in a neutral IGM at = 6. 
The opacities were obtained from a simulation (described be- 
low) for a typical line of sight towards a source residing at a 
cosmological density peak. Also shown is Rs, the radius of 
the Stromgren sphere, corresponding to the region of transmit- 
ted flux blueward of the Lya line center. It is worthwhile to 
note that there are different wavelength regions where either 
the resonant or the damping wing absorption dominates. These 
regions can shift according to the epoch being studied, since 
To scales linearly with xh, and also according to the quasar's 
luminosity L, since tr cc L"' (see eq. [4]). For example, since 
the damping wing is less dominant close to the Une center for 
lower x^f, details of the gas density and velocity distribution can 
be studied in this regime. Shortward of the edge of the Strom- 
gren sphere, no flux is observed for 10~^ < Xh < I, since the 
attenuation is large (ranging from ^ e~'^ - e"'^ for the val- 
ues of the neutral fraction studied in this paper). 

Simple modeling predicts several distinctive signatures that 
Xh should leave on a spectrum (see also Fig. 2 and accompany- 
ing discussion below). Most importantly: on average, a neutral 
universe would be expected to have a smoother spectrum on the 
blue side of the line, due to the larger contribution to the total 
optical depth from the IGM, which is a smooth function (see 
Fig. 1). 

In addition, the symmetry of the observed spectrum around 
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the Lya line center should be affected by xh- Since tr ^ 
on the red side of the Lya Une, the observed spectrum should 
trace out the emitted spectrum for low values of xh- For low 
values of xh, the red side of the line has negligible absorption, 
while the blue side is still affected by resonance absorption. 
This makes the observed spectrum highly asymmetric. In com- 
parison, in a neutral universe, the presence of a strong damping 
wing (td) causes additional strong suppression on the red side 
of the line, as well, making the line more symmetric (though 
overall weaker). Note that if we were to model out tr, leaving 
only the damping wing contributing to optical depth, a neutral 
universe would have a more asymmetric spectrum, because the 
damping wing would impose a sharper slope in the transmitted 
spectrum near the Lya line. Furthermore, in a composite spec- 
trum, one expects there to be a sharp drop in observed flux at 
the edge of the Stromgren sphere if the universe is mostly ion- 
ized. This occurs since the gradual cusp of the damping wing 
(immediately longward of the sharp rise in To, c.f. Fig. 1), is 
sub-dominant if the universe is mostly ionized, making the tran- 
sition from resonance-dominated to damping wing-dominated 
optical depth more sudden (see feature at 8390 A, in the lower 
panel of Fig. 3). On the other hand, if the universe is neutral, 
the damping wing cusp dominates at the edge of the Stromgren 
sphere, and creates a gradual drop-off in the observed flux (see 
the upper panel of Fig. 3). This effect is only observable if the 
Stromgren sphere is not large enough that tr, whose mean value 
scales as the square of the distance from the source, blocks off 
the observed spectrum by itself. 




Fig. 1. — Optical depth contributions from witliin (r^) and from outside 
(td) the local HII region for a typical line of sight towards a = 6 quasar 
embedded in a fully neutral IGM. The dashed line corresponds to td, and the 
solid line corresponds to tr. For details concerning the simulation, see § 3. In 
this example, the damping wing of the IGM, tq, contributes significantly to 
the total optical depth at Aobs ~ 8430 A and Aobs > 8470 A. 

The above effects manifest themselves in aU high-redshift 
sources (quasars, galaxies, GRB afterglows), and furthermore, 
they also scale predictably with the source strength and envi- 



ronment.^ As we will see below, the effects are difficult to de- 
tect or quantify for a single high-redshift spectrum, due to the 
uncertainty in the shape of the source's intrinsic emission, and 
to the stochastic nature of the density field along a given line 
of sight. However, with many sources such statistics can be 
usefully analyzed. 

3. SIMULATION OF SPECTRA 

We use a ACDM hydrodynamical simulation box at redshift 
Zj = 6 as the home of our source quasar. The simulations are de- 
scribed in detail in (Cen et al. 2004), and we only briefly sum- 
marize the relevant parameters here. The box is 11 /;"' Mpc on 
each side, with each pixel being about 25.5 /;"' kpc. This scale 
resolves the Jeans length in the smooth IGM by more then a 
factor of 10. The simulation also includes feedback in the form 
of realistic galactic winds. Stellar particles are treated dynam- 
ically as coUisionless particles, except that feedback from star 
formation is allowed in three forms: UV ionizing field, super- 
nova kinetic energy, and metal rich gas, all being proportional 
to the local star formation rate. Supernova energy and met- 
als from aging massive stars are ejected into the local gas cells 
where stellar particles are located. Supernova energy feedback 
into the IGM is included with an adjustable efficiency (in terms 
of rest-mass energy of total formed stars) of csn, which is nor- 
malized to observations (Cen et al. 2004, in preparation). 

We first identified the densest region in the box, as the natural 
location of a high-redshift source, such as a quasar. This cor- 
responds to a pixel with an overdensity of n/ («) ~ lO'' relative 
to the background, and to the center of a collapsed dark mat- 
ter halo with mass, Mhaio w 2 x lO^^M©. Density and velocity 
information was then extracted from 92 different lines of sight 
(LOSs), approximately evenly spaced in solid angle, originat- 
ing from this pixel. The step size along each line of sight (LOS) 
was taken to be 5.1 kpc, which resolves the Lya Doppler 
width by more than a factor of 40. The exact value of the step 
size was chosen somewhat arbitrarily, and does not influence 
the results as long as it adequately resolves the Doppler width. 
At each step, the density and velocity values were averaged for 
the neighboring pixels, and weighted by the distance to the cen- 
ter of the pixels. We extended each LOS by the common prac- 
tice of randomly choosing a LOS through the box, and stacking 
the pieces together (Cen et al. 1994). 

The hydrogen neutral fraction inside the Stromgren sphere, 
Xh{z), was calculated at each step in the LOS, and for sev- 
eral values of an isotropic background flux, Jbg{^) (in units 
of ergs"' cm~^Hz"' sr"'), corresponding to a particular Xh- As- 
suming ionization equilibrium, andxH(z) -C 1 inside the Strom- 
gren sphere: 



xh{z) = 



nuaB 



(4) 



where vh and <j are the ionization frequency and cross section 
of hydrogen, respectively, ag is the hydrogen recombination 
coefficient at T = 15,000 K, r is the luminosity distance between 
the source and redshift z, and is the quasar's intrinsic lumi- 
nosity in erg s~' Hz"' . The luminosity was taken to be = 
2.34 X 103l(^^/^'H)"'■^[(l+^)/(l+^.)r°■^ which results from 
redshifting a power-law spectrum with a slope of v~^-^, nor- 

^ Spectra of GRB afterglows originating in high redshift galaxies might not 
have noticeable tr, since the host galaxies could be too young or too faint 
to produce a significant Stromgren sphere (Barkana & Loeb 2004, Lamb & 
Haiman 2003). 
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malized such that the emission rate of ionizing photons per sec- 
ond is 2 X 10^^, matching the Elvis et al. (1994) template spec- 
trum and the brightness typical of the SDSS quasars. The back- 
ground flux, JBcit^), is also assumed to follow a v'^-^ power- 
law spectrum. Results are insensitive to the shape of the back- 
ground flux, since the dominant effect of JBoi'-') comes from the 
damping wing, and that only depends on the value of xh- Since 
the quasar's luminosity is dominant for the values of interest in- 
side the Stromgren sphere, we find that equation (4) can be very 
well approximated by x//(z) = A-nr^nHOiB/ J^iL^a /hv)dv. 

For several combinations of znn (corresponding to a Strom- 
gren sphere radius, Rs) and xh, the integrals in (2) and (3) were 
evaluated for each LOS. To expedite analysis, tq was also cal- 
culated using the approximation (Miralda-Escude 1998): 



td 



6.43 X 10"% 

I+Zhh 
1+z 



-/ 



mecH(z) 

1 +Zend 



(5) 



l+z 



where: 
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Equations (3) and (5) give very similar results for wavelengths 
away from resonance (i.e. inside the Stromgren sphere), but (5) 
is much quicker to compute. 

The Voigt profile was used to approximate the Lya absorp- 
tion cross section (Rybicki & Lightman 1979). We assumed a 
temperature of 15,000 K for gas inside the Stromgren sphere. 
Outside the Stromgren sphere, for the case of a mostly ion- 
ized universe, an IGM temperature of T = 10^ K was used, 
while the temperature in a neutral universe was taken to be T = 
2.73 X 151(J^)2 K, vaUd for z < 150 (Peebles 1993). Our re- 
sults are insensitive to the exact temperature used. The Doppler 
width of the Lya absorption cross section scales as vd oc T'/-^, 
but the total integrated area under the cross section is indepen- 
dent of temperature. 

Finally, to simulate observations, we had to smooth the raw 
spectra we compute. Physical smoothing due to gas pressure, 
present in the simulations on scales of ~ 10 km/s (Gnedin 2000) 
corresponds to smoothing on a wavelength range of AAobs = 
AA.5(1+Zj), or ^ 0.3 A at Zs = 6. However, current spectral 
resolutions achieved for high-redshift quasars (APO, Keck) are 
about a factor of three worse than this value. In order to simu- 
late a realistic spectral resolution of ~ 1 A, all resulting spectra 
were smoothed over 1 A bins (20 steps in a LOS) by averaging 
over each bin. 
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Fig. 2. — Simulated e"^ for two different LOSs (top and bottom) in two 
different regimes: jch = 1 (left panels) and xm = ().()()8 (right panels). All 
spectra correspond to Rs = 43 Mpc. The solid line corresponds to e^'^ from 
equation (1) and the discrete points correspond to e^^" from equation (2). The 
sharp absorption features in the figures are produced by tr, while the general 
trend of increasing optical depth away from the line center is a combination of 
both tr and td. 



4. INTERPRETATION OF SPECTRA 

For illustrative purposes, in Figure 2 we show four differ- 
ent examples for the spectrum of a source with Rs = 43 Mpc. 
We show spectra assuming two different values of xh'. 1 or 
0.008, corresponding to Jbg{i^h) = and 8 x 10"^^ erg s"' cm"^ 
Hz~^ sr~\ respectively. This range for the background flux is 
approximately what is allowed by the current analysis of known 
z ~ 6 quasars (e.g. Fan et al. 2003; Cen & McDonald 2002). 
The value of Rs was chosen because it is representative of the 
brightest SDSS quasars, and it corresponds to the radius of a 
sphere enclosing ~ 2 x 10^' ionizing photons per second x lO'^ 
s hydrogen atoms (Cen & Haiman 2000b). The value of the 
source's Stromgren sphere radius is left as a free parameter to 
be determined by the inversion method (described below). 

Also, shown in Figure 2 is the effect of the damping wing in 
smoothing out the spectrum. The flux decrement due to reso- 
nance absorption, e""^", is shown with discrete points, and the 
total flux decrement, e"^*"^", is represented with a solid line for 
two different LOSs (top and bottom). Flux decrements from 
a neutral universe {left panels) are visibly smoother than those 
of a mostly-ionized universe {right panels), as predicted. The 
sharp absorption features in the figures are produced by r«, 
while the general trend of increasing optical depth away from 
the Une center is a combination of both r« and td- 
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Fig. 3. — Composite mean spectra from 92 LOSs, for the cases of j;;^ = 1 {top 
panel) and xh = 0.008 {bottom panel). For illustrative purposes, a Gaussian + 
continuum emission template is overlaid {solid line), and the resulting mean 
observed spectrum is shown underneath {dashed line). The source quasar is 
at redshift, Zs = 6, and the Lya line center is at 8509.69 Mdotted line). The 
spectrum shown in the top panel is measurably smoother than the one shown 
in the bottom panel. 



In Figure 3, we show the composite mean spectra from 92 
LOSs, for the two different values of Xh- 1 {top panel) and 
0.008 {bottom panel). For illustrative purposes, a Gaussian + 
continuum emission template is overlaid {solid line), and the re- 
sulting mean observed spectrum is shown underneath (dashed 
line). The emission template was chosen somewhat arbitrarily, 
in order to resemble the typical spectrum of observed quasars 
(e.g. Vanden Berk et al. 2001), and is comprised of a flat con- 
tinuum emission added to a Gaussian of width 2500 km/s and 
a peak-to-continuum ratio of ^ 5. From Figures 2 and 3, we 
can see that the composite mean spectra for these regimes ex- 
hibit the traits discussed in § 2: spectra from a neutral universe 
are smoother on the blue side of the line and more symmetric 
around the hne center (at 8509.69 A); however, if the tr contri- 
bution is statistically modeled out, the xh = 0.008 spectra would 
be more symmetric; also there is a sharp drop in the compos- 
ite observed flux at the edge of the Stromgren sphere in the 
Xh = 0.008 universe, that is not present in the neutral universe 
composite spectrum. The latter feature is only statistically de- 
tectable from a large sample of LOSs, and was not used in our 
analysis below. In the analysis below, we ignore the ^ 5 A 
range redward of the edge of the Stromgren sphere, since in 
this region, the exact structure of the transition region from the 
Stromgren sphere to the IGM contributed significantly to td, 
and the transition region can have a large spectrum-to-spectrum 
variabiUty. 

4.L Inverting the Observed Spectrum to Findxn 

We are now presented with an interesting problem: suppos- 
ing the only information we have is the observed spectrum, can 
we extract the quantities of interest, namely xh and Rsl For 



the purposes of an ideahzed analysis, we will first assume that 
we know the shape of the intrinsic emission template to infi- 
nite precision (however, we do not assume knowledge of the 
amplitude of the template). In the next section, we shall re- 
lax this restriction. The other major assumption we implicitly 
make is that we can accurately predict the distribution of tr, 
using the small-scale power statistics from the simulation box. 
This assumption can be checked by studying more extensive 
simulations in the future (see discussion in § 5 below). 

Before diving into the details, we first outline the main steps 
of the procedure. We start with a simulated observed spec- 
trum. Then we guess values for the radius of the Stromgren 
sphere, R'^, and the IGM hydrogen neutral fraction, x'fj. Next 
we approximate the amplitude of the source's intrinsic emis- 
sion, A', implied by the choices of R'^ and x^, using the red 
side of the Lya line where resonance absorption can be ne- 
glected. From the observed spectrum, we divide out the as- 
sumed intrinsic emission (A' x known spectral shape), and the 
assumed damping wing flux decrement, e'^o(Aobs,«s,XH)^ calling 
the result ^'(Aobs)- If our choices of R'^ and xj^ were correct, 
'^'(•^obs) should represent the resonance absorption flux decre- 
ment alone. Hence, we compare a histogram of the implied res- 
onance optical depths, -ln[5'(Aobs)], to the known histogram 
of resonance optical depth (obtained from the simulation). We 
then repeat this procedure with different choices of R'^ and x^, 
finding the ones whose implied resonance optical depths most 
closely match the known histogram. We shall now elaborate on 
this procedure below. 

We start with the mock observed flux, of the form: 



(6) 



where A is the amplitude of the adopted normalized emission 
template r(Aobs), and TR(Aobs) = TR{Xobs,Rs,XH) also takes into 
account the small contribution of Jbg to xh{z). Since we as- 
sume we know the shape of the source's intrinsic emission, for 
simphcity we shall set ^(Aobs) = 1 • Actual features in the spec- 
trum will only effect the signal-to-noise of detection under this 
optimistic assumption (see discussion below). Next, we guess 
values for7?j andxj^. For these values, TD{\red,R's,x'fj) is calcu- 
lated for a wavelength, Xred, on the red side of the line. We then 
estimate the amplitude of the intrinsic emission template, which 
these values would imply at Xred- A' = F{Xred) 
Since for the red side of the hne, tr — > 0, we are left with the 
foUowing estimate of the emission amplitude: 



A' ~ (A e'^o(Ked,RsyH)-'^D(Ked,Rs^H)^ 



(7) 



where for increased accuracy. A' can be averaged over several 
values of Xred, corresponding to a smooth region in the red side 
of the observed spectrum. Since our simulated spectra are well- 
behaved, averaging was not needed, and A' was evaluated at an 
arbitrarily chosen Xred = 8514 A. It should be noted that, when 
using real spectra, one has to be careful to chose Xred values 
that are not near other emission lines. 

Next, we divide the input flux by A', and extract an esti- 
mate of the resonant optical depth, ■7^(Aobs), using the damp- 
ing wing contribution, e'^''^^'^^''^s'^"\ for observed wavelengths 
on the blue side of the line: i7{(Aobs) = -In (F(Aobs) {A')~^ 
g-rD(A„b„R^,4)^^ which reduces to: 
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Fig. 4. — Histogram of the template distribution tr {top left), and liistograms 
of the derived distribution: (xh.Rs) = (1,43 Mpc) (correct values) (top 
right), (xh.Rs) = (0.2,43 Mpc) (bottom left), (xh,Rs) = (1,85 Mpc) (bottom 
right). We test the hypothesis that the top right, bottom left and bottom right 
histograms (among many others in parameter space) were drawn from the dis- 
tribution in the top left. We find that the top right panel is consistent with being 
drawn from the distribution in the top left, and that the bottom panels are not. 



7'ij(Aobs) = -ln(e 



TR(Xobfi) 



(8) 



where = TD(\obs,Rs'X'„) and = TB(Xohs,Rs,XH)- The first 
term on the RHS of equation (8) represents our misestimate of 
the damping wing contribution on the blue side, and the second 
term represents our misestimate of the continuum inferred from 
the red side. 

Then, a final correction is applied to remove the contribution 
of the assumed background flux corresponding to our guess of 
x'jj, J'sQiv), to the hydrogen neutral fraction inside the Strom- 
gren sphere, xh{z), defined by (4): 



Tj^(Aobs) = Tj?(Aobs) 



(9) 



where T^(Aobs) is our estimate for the resonance optical depth at 
Aobs, for our guess of and normalized to the resonance 

optical depth in a neutral universe (i.e. Jbg = in eq. [4]) for 
purposes of comparison. 

A histogram of Tj^ is constructed, and compared to the known 
template histogram of tr, extracted from the combined data 
from 92 smoothed mock spectra, created by embedding the 
source in a neutral universe (see Figure 4). In order to ex- 
clude spectral pixels that would be too faint for an actual flux 
measurement, the inferred values of > 16 were discarded 
from the analysis. This choice is motivated by the current lim- 
its available from the z ~ 6 SDSS quasars. The current best 
lower Umit on the optical depth just blueward of the Lya line is 
actually worse (t ~ 6; White et al. 2003), but since the higher 
Lyman series lines have lower opacities, a better lower limit, 
T w 22, is available from the Ly/3 region of the spectrum (White 



et al. 2003). In principle, the Ly/3 (and Ly7) spectral regions 
could be added to our analysis. A test statistic is used to 
compare the two templates: 



X 



■■N, 



points 



fi 



(10) 



where Npoims is the number of values extracted from the spec- 
trum, fi and Fi are the fraction of values expected and received 
in bin i, respectively. The deviations are approximately Gaus- 
sian, af = Npoims fi- 

This procedure is then repeated for many choices of R'g and 
x'fj (shown below), and the test statistic in equation (10) is 
minimized. 
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Fig. 5. — Probability contours for the inferred neutral fraction and Stromgren 
sphere radius, generated by the analysis in § 3.1. The background IGM is 
assumed to be neutral, and the X marks the position of the true value of Rs 
and xh- Top panel: The inversion procedure is applied to a single source. The 
90% confidence contours (not shown) enclose only the true {xh, Rs) values, 
to within an error of (30%, 5%), respectively (only the single pixel containing 
the correct (fiducial) values is enclosed within these contours). Confidence 
contours of 99% are shown, and constrain the IGM neutral fraction to xh ^ 
0.1. The irregular, island shaped contours are the result of the unique features 
of a single LOS. Bottom panel: ProbabiUty contours generated by averaging 
the same analysis over five random LOSs (averaging x values from eq. [10]) . 
The irregular, 99% confidence peaks from the top panel are smoothed out, and 
the contours are tighter. The correct (xh, Rs) values were chosen with 99% 
confidence, to within an error of (30%, 5%), respectively (contour not shown). 
The 99.73% and 99.99% contours (solid and dashed lines, respectively) are 
shown, ruling out axH < 0.1 universe at better than 4a significance. 



The results of this procedure can be seen in Figures 5 and 6, 
and are quite encouraging. These figures show UkeUhood con- 
tours for combinations of (x^,/?^) for two cases of the true pa- 
rameters, {xh,Rs) = (1, 43 Mpc) and {xh,Rs) = (0.008, 43 Mpc). 
The statistical constraints shown in these figures are obtained 
from one typical LOS to a single source. Both cases had a min- 
imum reduced 1 for 22 degrees of freedom (d.o.f.), corre- 
sponding to two fitted parameters (d.o.f. = {Nbins-'^)-2). The 
Xh X Rs probability grid in Figures 5 and 6 consists of 25 x 35 
grid points, respectively. 

In the case of a neutral universe (top panel of Fig. 5), the 
correct values of (xh,Rs), to within an error of (30%, 5%), re- 
spectively, are identified by our procedure at 90% confidence. 
We do not display these contours, as they would be represented 
by a point in the figure. Confidence contours of 99% are shown, 
and constrain the IGM neutral fraction to xh > 0.1. The irreg- 
ular, island contours in the top panel of Fig. 5 are the result 
of the unique features of a single LOS, and get smoothed out 
when averaged over several LOSs, as seen in the bottom panel. 
The bottom panel of Figure 5 shows the probabiUty contours, 
averaged over five different random LOSs. The correct (xh, Rs) 
values were chosen with 99% confidence, to within an error 
of (30%, 5%), respectively. The 99.73% and 99.99% contours 
(solid and dashed Unes, respectively) are shown, ruUng out a 
X// < 0. 1 universe at better than 4(t, and an xh < 0.7 universe at 
3(7. As expected, confidence contours get tighter as more LOSs 
are analyzed. 

For the case of an ionized universe with xh ~ 0.008 in Fig. 
6, the Icr uncertainty contours (not shown) enclose the correct 
parameters, but the contours obtained from the single LOS are 
not as tight as in the neutral universe case, as was predicted in 
the introduction. Nevertheless, a neutral universe was ruled out 
at the 99% confidence level, as shown in Figure 6. 

It is worthwhile to note that the iso-contours in Figures 5 
and 6 go from bottom right to top left, indicating a degeneracy 
between small xh, small Rs, and large xh, large Rs solutions. 
This is to be expected since the contribution of td to the ob- 
served spectrum can be diminished both by moving the edge of 
the Stromgren sphere further away from the source (increasing 
Rs), or by having a more highly ionized IGM (decreasing xh). 
Due to the hmited spectral range used, the determination of the 
amplitude of the host's intrinsic emission. A', is most affected 
by this degeneracy; however, since the shapes of the damping 
wings are different in these scenarios, the degeneracy is not ex- 
act and can be lifted by increasing the number of sources used 
in the analysis. 

A useful by-product of this procedure is that it produces an 
estimate of the amplitude of the intrinsic emission (A' = A for 



the correct choice of {xh,Rs))- 
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Fig. 6. — Probability contours for the inferred hydrogen neutral fraction 
and Stromgren sphere radius, generated by the analysis in § 3.1. The analysis 
is applied to a single source. The background IGM is assumed to have xh = 
0.008, and the X marks the position of the true value of Rg and xh. Contours 
denote the 99% confidence limits, and rule out a neutral universe. 

An interesting issue of biasing can be raised here. Since the 
quasar sits in an overdense region, histograms of tr could have 
a rather large scatter close to the line center due to the increased 
spread in the density distributions and the pecuhar motion of 
the gas within the infall region around the halo (e.g., Barkana 
& Loeb 2003). This biasing effect was studied, and on aver- 
age, we find that the density field reaches within 15% of the 
mean density ~ 5 A blueward from the observed fine center, 
for our redshift Zs=6 quasar. However, the distribution of den- 
sities at a fixed distance has a high-end tail, comprised of a few 
LOSs. Although we can model this bias for our quasar, one 
would need the density profiles surrounding many host sources 
to fully explore these statistics for a larger sample of quasars. 
This is feasible in principle, in large future simulations. How- 
ever, for the purposes of generality, in the present paper the blue 
side of the spectra was cut-off 5 A blueward of the Une center 
and omitted from our analysis. Envirormient biasing will be 
discussed in detail in § 5.3. 

Another uncertainty can arise if the redshift of the Lya line 
center is not well determined. An offset in the line center of 
±1000 km s"' (typical of quasar jets), results in an effective Rs 
offset of only ±1.3 comoving Mpc. A larger systematic offset 
of Az ~ 0.01-0.05 in the line center, corresponding to ARs ~ 
4-20 Mpc, can be present if an associated metal line with a 
well-determined redshift has not been detected. This can rep- 
resent a non-negligible misestimate in the radius of the ionized 
region (especially for faint sources), and underscores the impor- 
tance of accurate redshift-determinations from metal-line de- 
tections. Nevertheless, we note that even these relatively large 
Rs uncertainties do not have a major impact in distinguishing 
among the Xh values of interest, which are separated by two or- 
ders of magnitude. This can be seen explicitly by noting that 
a vertical change of order ARs ~ 4 - 20 Mpc along the proba- 
bility iso-contours, seen in Figures 5 and 6, corresponds to less 
than an order of magnitude change in the value of the neutral 
fraction. This is the case even for fainter sources, as discussed 
below. 
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4.2. Simulating Error in the Emission Template 

The assumption of knowing the shape of the quasar's intrin- 
sic emission is not as outlandish as it might seem. Vanden 
Berk et al. (2001) created a composite quasar spectrum from 
~ 150SDSS spectra of quasars around z~ 3. Even though there 
were variations in amphtude, the normahzed, mean template 
was defined around the Lya line to better than 5%. Although 
spectrum- to- spectrum variations are considerable in the core of 
Lya and blueward, most of these variations are accounted for 
by the Baldwin effect (the decrease of emission line width with 
increasing luminosity) and the absorption by the Lya forest, 
both of which can be statistically removed. Hence, for this pa- 
per, we assumed spectrum- to- spectrum variations of < 20%; a 
typical value of the r.m.s. scatter in the continuum level imme- 
diately redward of Lya (Vanden Berk et al. 2001). 

Fan et al. (2003) demonstrate that the high-redshift quasars 
follow Lya emission hue and continuum spectral shapes that 
are remarkably similar to that of their low-redshift counter- 
parts. More precisely, the high redshift sample size is small, but 
the spectra of the present handful of z ~ 6 quasars are consis- 
tent with being drawn from the Vanden Berk et al. distribution 
of spectral shapes (Fan 2003, private communication). 

Here we address the effect of the uncertainties of the as- 
sumed spectral template in two ways: we assume (1) an un- 
known overall "tilt" in the spectrum, or (2) an unknown pixel- 
to-pixel, uncorrelated, normally distributed scatter around a 
well-determined mean spectrum. These uncertainties add ex- 
tra parameters in our analysis, and make the constraints on the 
neutral fraction less tight. However, as we will argue below, 
tight statistical constraints on the neutral fraction can still be ob- 
tained with a sufficiently large sample of high redshift sources. 

4.2.1. Tilt in the Emission Template 

The uncertainty in the emission template was first modeled 

with a pivoting procedure, chosen to be hinged at the wave- 

piv — ^red 



length used for the amplitude estimation in eq. (7), A 



= 85 14 A. This mimics an incorrectly chosen power-law for the 
quasar's continuum emission (and also a tilt in the Lya emission 
line profile). Such a tilt would also characterize an uncertainty 
in the power-law index of GRB afterglow spectra (see § 5.2). 

Thus, the observed flux is assumed to be of the form: F(Aobs) 
= A (^)« e-7-«(Aote)-rD(A„ta,«s,A:H) Furthermore, equation (8) is 

modified to: 



T«(Aobs) = -ln(.™(^*> e-«--)-ln I ^ 



(11) 



where a' is now a guess for the spectral slope a. 

The shape of this power-law tilt is similar, but not mathemat- 
ically degenerate with the shape the damping wing. It therefore 
causes constraints to degrade. Here we characterize this degra- 
dation using: 



(12) 



where {Ntm) is an estimate of the number of data points (i.e. 
spectral resolution bins) required to distinguish between the 
two spectrum shapes, and cr,,;, = Ato-,/Atr, with Atd-? = 

(|ln(i.^ 



^o)-ln(^)" " I) being a measurement of the typ- 



shape of the power-law and the shape of the damping wing flux 
decrement, and Atr w 0.1 is the width of the template tr his- 
togram. 

We find that the tilt in the emission template obtained with a 
misestimate of a is not degenerate with the shape of the damp- 
ing wing for our quasar. The largest value of {Ntm), obtained 
from our parameter space occurred for the degeneracy between 
the fiducial values of {xh = 0.008, Rs = 43 Mpc) and the pa- 
rameter choice of (x^ =l,R'g = 99 Mpc) with a tilt misestimate 
of a-a' « -3. For these values, {N,ii,) ^ 10^, corresponding 
to, e.g., 10 similar quasar spectra with 100 usable independent 
spectral resolution elements. 

4.2.2. Pixel-by-Pixel Errors 

The uncertainty in the emission template was next modeled 
assuming uncorrelated, Gaussian distributed errors. Such pixel- 
by-pixel uncertainties could also represent the noise associated 
with the flux detection in each bin. Hence, the value of the 
emission template, r(Aobs), in the input flux in equation (6), 
now becomes a Gaussian distributed random variable with a 
mean value for each wavelength given by (r(Aobs))- The inver- 
sion equation to replace equation (8) then becomes: 



TR{\>hs) = - ln(e 



hi 



)-ln 



(13) 



7'(Aobs) 



,(r(Aobs)). 

In the case of small deviations from the template value, 



ical spread of the estimated t'r obtained from differences in the 



(r(A„bs)> 

~ 1, ')^(Aobs) is approximately Gaussian distributed around [- 
lii(gT«(A„b,) gri-TD)_in (^)] The exact shape of the probabil- 
ity distribution of tr resulting from the Gaussian uncertainty in 
r(Aobs) was calculated and convolved with the template tr dis- 
tribution (top left panel in Fig. 4). This new, normalized, tem- 
plate histogram was used as the new probabihty density func- 
tion, fi, in the statistic shown in (10). 

To test the robustness of our results to these pixel-by-pixel 
errors, a random LOS was drawn from our pool of LOSs, a 
Gaussian distributed emission template, ^(Aobs), was generated 
to create the mock spectrum, and the inversion procedure out- 
lined in § 4.1 was performed, using equation (13) to generate 
the histograms. This procedure was repeated, updating the 
values in the parameter space with each newly processed 
LOS (each new LOS representing a different source in a hy- 
pothetical sample) until we were able to distinguish a neutral 
universe from a xh < 0.008 universe with 99% confidence. 

The results for the number of sources needed for the 99% 
confidence constraints are sunrniarized in Table 1. The emis- 
sion uncertainties refer to the standard deviation of the pixel- 
by-pixel Gaussian distributed errors in the emission template. 
With a 20% uncertainty in the intrinsic emission template, an 
average of, {Nws) ~ 34 LOSs were required to rule out a neu- 
tral universe when the true value of the hydrogen neutral frac- 
tion was Xh = 0.008. In the = 1 regime, only ~ 3 spectra on 
average were required to rule out a. xh < 0.008 universe with 
99% confidence. This should be expected, since as discussed 
previously, the neutral IGM leaves a heavier footprint on the 
quasar spectra for a reasonably sized Stromgren sphere. 

Aside from the implicitly assumed calibration and elimina- 
tion of the Baldwin effect, the adopted 20% uncertainty does 
not make use of any information (such as correlations with 
other observables) from the observed spectrum being processed 
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to further improve constraints. Preliminary results from a prin- 
cipal component analysis (PCA) of the spectrum-to-spectrum 
variations suggest that one might be able to characterize the 
variance in the emission template with only three eigenvalues 
(Vanden Berk 2003, private communication). It is likely that 
with such characterization of the variance and the usage of in- 
formation redward of Lya, the template uncertainty can be re- 
duced, thus reducing {Nws}- With a 10% pixel-by-pixel uncer- 
tainty, an average of 25 LOSs were needed to rule out a neutral 
universe for x;i = 0.008, and 3 to rule out an xh < 0.008 uni- 
verse for = 1. A 5% uncertainty brings the average number 
of required spectra down to 14 and 2, for xh = 0.008 and xh = 1, 
respectively. These results, as well as the r.m.s. deviations, are 
summarized in Table 1 . 

Table 1 

Average number of bright quasars required to 

DISTINGUISH BETWEEN = 1 AND Xh < 0.008 WITH 99% 
CONFIDENCE. 



Emission 


{Nws} 


with 


{Nws} 


with 


Uncertainty 


Xh = 


1 


Xh 


= 


.008 


20% 


3.42 ± 


1.95 


33.6 ± 


20.8 


10% 


3.39 ± 


2.15 


24.6 ± 


20.0 


5% 


2.45 ± 


1.13 


16.! 


i± 


14.2 



5. MERITS OF DIFFERENT SOURCE TYPES 

Although the preceding analysis was done with simulated 
quasar spectra similar to the bright SDSS high-redshift quasars, 
it can be repeated on other high-redshift sources, namely qua- 
sars with lower luminosities, high-redshift galaxies and GRB 
afterglows. The properties of the sources scale predictably with 
their luminosities, with the two most pertinent to this analysis 
being the size of the source's Stromgren sphere and the shape 
of the source's intrinsic emission spectrum. 

5.1. Stromgren Sphere Size 

The size of the source's Stromgren sphere, Rs, approximately 
scales as the source's (luminosity)'/-'. A fainter quasar, of sim- 
ilar age, might therefore have fewer measurable spectral points 
inside the Stromgren sphere, with which to create the his- 
togram, which weakens constraints. On the other hand, since 
the Stromgen sphere is smaller, the damping wing has a stronger 
effect on the spectrum, which strengthens constraints. 

To estimate the overall effect on the determination of the 
neutral fraction, we repeated the pixel-by-pixel analysis out- 
lined in § 4.2.2 on a 100 times fainter quasar (L^ = 2.34 x 10^' 
(jy/iynT^'^ [(1 +z)/(l +z.s)]~'' **) with a correspondingly smaller 
Stromgren sphere (Rs « 9.3 Mpc). From simulated spectra of 
such mock quasars in a neutral universe, we find that a xh < 
0.008 universe is ruled out at 99% confidence with ~ 3 LOSs. 
This result is therefore comparable to that available from the 
brighter quasars (shown in Table 1). This can be understood 
by realizing that even though fewer spectral points are avail- 
able for the analysis of a fainter source (since the Stromgren 
sphere is smaller), these points are those close to the edge of 
the Stromgren sphere where the damping wing has a sharper 
slope, and are given more statistical weight due to the lack of 



points far away from the edge of the Stromgren sphere, where 
the damping wing is flatter. On the other hand, we find that 
approximately 10^ - 10^ faint quasar LOSs are needed on av- 
erage, for the emission template uncertainties of 5%-20%, in 
order to break the degeneracy between the damping wing of a 
small Xh, small Rs and a large xh, large Rs parameter choice. 
Since the probability iso-contours go from bottom right to top 
left in the parameter space of Figure 5, limiting parameter space 
to Rs < 100 Mpc (as we had implicitly done above) allowed for 
a stronger degeneracy in an ionized universe in the case of a 
small Rs than in the case of the larger Rs quasar used through- 
out the preceding analysis in this paper. If instead we limit 
parameter space to Rs < 30 Mpc, the fainter sources are able 
to distinguish between a neutral universe and axn < 0.008 uni- 
verse, using again a comparable number (i.e., tens) of sources 
to the brighter quasars shown in Table 1. Whether Umiting the 
size of the Stromgren sphere to 30 Mpc is a reasonable prior 
assumption depends on the actual inferred neutral fraction. The 

-1 /3 

size of the Stromgren sphere scales as x^' , and for sources 
that are ~ 100 times fainter than the z ~ 6 SDSS quasars, they 
would require > 10^ years to reach a 30 Mpc size even if em- 
bedded in an IGM with neutral fractions as low as xh = 0.008. 

An implicit drawback of fainter sources is that they have 
smaller S/N than bright sources, thereby reducing the effective 
detection threshold. Becker et al. (2001) and White et al. (2003) 
state the Ic lower limits to the Lya optical depth inferred from 
the Lya trough to be t/,„ ^ 5 or 6, for the Keck ESI spectra 
of z ~ 6 quasars. As stated previously, the analysis in this pa- 
per ignored inferred values of > 16. Because of the sharp 
drop in the high-end tail of the tr histograms (see Fig. 4), the 
inversion procedure presented in § 4. 1 is not sensitive to the 
exact value of the detection Umit. In our template histogram, 
only 5% of the values were within the range 6 < tr < 16, ap- 
proximately corresponding to data points used in the analysis, 
which are undetectable with Tii„ ^ 6? However, a 100 times 
fainter quasar would have its optical depth detection threshold 
reduced by ln(100)=4.6 to obtain equivalent S/N for the same 
integration time. For our mock spectra, this would eliminate an 
additional 20% of the tr values, in the range of 1.4 < tr < 6. 
Hence, fainter sources have a smaller range of usable r« values 
than brighter sources. Nevertheless, higher S/N observations 
would decrease this effect. 

Sources that are too faint to have many detectable spectral 
points on the blue side of the line, such as faint galaxies and 
GRB afterglows, would not be useful in this r« histogram anal- 
ysis, but would still have a strong damping wing imprint on 
their spectrum redward of Lya. As mentioned previously, red- 
ward of Lya, Tfi is negligible, and the small-scale power anal- 
ysis presented in § 4. 1 is irrelevant. If the sources are intrinsi- 
cally faint enough, the slope of the damping wing redward of 
Lya is sufficient to distinguish between a small Rs, small xh 
case and a large Rs, large xh scenario with only a few sources. 
As emphasized recently by Lamb & Haiman (2002) and Bar- 
kana & Loeb (2004), GRB afterglows with a clean power-law 
intrinsic spectrum would be especially well suited to such an 
analysis, provided they can be identified in future datasets (e.g. 
of Swift). 

As an example, the usefulness of the red side of the Lya line 
in determining xh was also investigated for our simulated faint 
quasar with /?s « 9.3 and xh ~ 0.008, assuming pixel-to-pixel 

^There is a small bias associated with the fact that the detection limit, T(,m, is 
a limit on the total optical depth, tstr+td, and not just tr. 
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Gaussian scatter in the emission template. In principle, suffi- 
ciently far on the red side of the Lyman a line, where reso- 
nant absorption does not contribute, this analysis could be per- 
formed without a numerical simulation. However, due to the 
cosmic infall, some of the pixels immediately on the red side 
of the Lya hne center correspond to foreground gas between 
us and the quasar (the resonant absorption extends, in our case, 
up to 7-8 Angstroms to the red, see Figure 8). For this rea- 
son, we did use the simulation in the analysis. Alternatively, to 
avoid uncertainties due to modeling the cosmological infall, one 
could omit the spectral regions that are blueward of the infall 
regime. Here we utiUzed the flux in the observed wavelength 
range of 85 14 A - 8544 A and performed a simple comparison. 
A more robust analysis of the red side would involve mini- 
mization in the 3-dimensional parameter space of {xh, Rs, A), 
where A is the ampUtude of the intrinsic emission. However, 
for purposes of comparison to the tr histogram analysis pre- 
sented in § 4. 1 , it was sufficient to obtain an estimate of A using 
an assumed region in the observed spectrum with an accurately 
known template shape, and calculating A' as outlined in § 3.1 
and equation (7). For a choice of and R'g, we therefore obtain 
an estimate of the flux decrement due to resonance absorption. 



(r(A„bs)) 



which expands to: 



<^/f(Aobs) = e 



A TjXobs) 

A' inxobs))' 



(14) 



For the correct parameter choices, c/«(Aobs) ~ e"'^**'^°'»^ w 1, 
since tjj — > on the red side of the hne and redward of the 
infall regime. The mean flux decrement, (<iR(Aobs)), averaged 
over several randomly chosen LOSs, was compared to the av- 
erage decrement obtained from the simulated spectra at each 
wavelength, (^e''^"^^'*'^), with the Xr statistic: 



({e-^'^^"'">)-{dRiXobs))f 



cr.., 



(15) 



where the summation limits were chosen to be l^ain = 8514 A 
/max = 8544 A and am is the standard deviation of the intrin- 
sic emission template divided by the square root of the number 
of LOSs used in determining the mean decrement, ((^^(Aobs))- 
LOSs were chosen at random, and the mean decrement and test 
statistic in equation (15) were updated until a neutral universe 
could be ruled out at 99% confidence. 

Results from this procedure are quite comparable to the tr 
histogram analysis on the blue side of the Lya line for our 
faint quasar with Rs ~ 9.3 and xh ~ 0.008. On average, we 
find that 10^ - 10^ LOSs were needed to rule out a neutral uni- 
verse at 99% confidence for template uncertainties of 5%-10%, 
just as in the tr histogram analysis. Although the red side is 
cleaner (negligible resonant absorption), it is further away from 
the edge of the Stromgren sphere, so the damping wing shape 
is flatter and is therefore harder to detect. However, our results 
suggests that if a larger wavelength range is available for the 
analysis, or if the source has a smaller Stromgren sphere a 
few Mpc), analyzing the red side of the Lya line could prove 
more efficient than the blue side in determining xh-^ 

'^There is an additional complication that an intervening damped Lyman alpha 
(DLA) system along the line of sight causes absorption whose shape is partially 
degenerate with that due to the damping wing from a smooth IGM (Miralda- 
Escud6 1998; Barkana & Loeb 2004). However, the degeneracy is not exact, 
and lines of sights with DLAs should be rare. 



Furthermore, very faint sources have other tracers of xh- For 
example, Haiman (2002) showed that the shift in the peak of the 
observed Lya emission line (relative to other emission fines), or 
its measured asynmietry, as a function of the source luminosity, 
can be used as a probe of the IGM hydrogen neutral fraction. 

5.2. Intrinsic Emission 



Our knowledge of the source's intrinsic emission varies con- 
siderably with source type. Again, it is only the imperfect 
knowledge of the shape of the emission spectrum that compli- 
cates the analysis presented here (i.e. knowledge of the overaU 
amplitude of the spectrum is not needed). On average, quasars 
seem to have a moderately well constrained emission template 
around Lya, with an uncertainty of ~ 20%, once the spectrum 
is normaUzed and systematics such as the Baldwin effect are ac- 
counted for. It is possible that this uncertainty could be further 
decreased with a principal component analysis (PCA), which 
has yielded well-characterized spectrum-to-spectrum disper- 
sions on the red side of the Lya line (Vanden Berk 2003, private 
communication). In comparison, the spectral shape of emission 
from galaxies is more poorly defined around Lya (Steidel et 
al. 2001). One would therefore need larger samples of Lya 
emitting galaxies at both lower redshifts and higher redshifts 
for a better empirical calibration of their spectral shape and its 
dispersion. Such samples may soon be available from exten- 
sive Lya searches using HST (Rhoads et al. 2003) and Subaru 
(Taniguchi 2003). GRB afterglows have a smooth power-law 
spectrum shape. There appears to be a scatter in the power-law 
index of roughly ~ 10% in photometric data on low-redshift 
bursts. The sample of higher-redshift bursts, whose spectral 
shape can be studied in more detail around the Lya line, is 
still too small to quantify spectrum-to-spectrum dispersions, 
but they appear to closely follow power-laws (Mirabal et al. 
2003). Note that a pure power-law uncertainty is not degener- 
ate with the shape of the damping wing (see § 4.2.1). The major 
issue with GRB afterglows will be whether the high-redshift 
afterglows can be identified among the lower-redshift events 
(see, e.g.. Lamb & Reichart 2000 and Barkana & Loeb 2004 
for recent discussions). 

5.3. Environment Bias 

The source's host environment has a large effect on the ob- 
served spectrum near the line center, as emphasized recently by 
Barkana & Loeb (2004). To accurately model this effect, and 
to test semi-analytical models (Barkana 2004) hydrodynamical 
simulations are needed to generate the local density and veloc- 
ity fields. In Figures 7 and 8, we show the effect of this bias- 
ing for our mock quasar. Figure 7 shows the histograms of the 
sphericaUy averaged (within our set of 92 lines) radial compo- 
nent of the peculiar velocity fields constructed from 2 A bins in 
Aobs, at increasing distances away from the quasar. Most of the 
radial velocities closest to the host pixel exhibit strong infall, 
with the PDF peaking at peculiar radial velocities of v « -100 
km/s. However, there is a significant, high- velocity tail in the 
distribution, corresponding to several LOSs that exhibit strong 
outflows (100-500 km/s) close to the host pixel. As the dis- 
tance from the central overdensity increases, the radial velocity 
PDFs become narrower and more centered around km/s, as is 
expected for a randomly chosen point in space. Strong outflow 
features disappear ~ 5 A away from the hne center. After about 
10 A away from the line center, the velocity histogram becomes 
quite symmetric around km/s. 
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These proximity effects have a strong impact on the reso- 
nance optical depth, tr, as shown in Figure 8. The dotted 
curve shows tr calculated assuming the mean density, nuiz) = 
Wff.o (1 +z)^, and not including velocity information from the 
simulation. The dashed curve includes the density field from 
the simulation, but not the velocity field. Finally, the solid 
curve includes both the density and velocity information from 
the simulation box. The values of tr shown in this figure were 
averaged over all LOSs: -ln{e~^''). The figure clearly shows 
that the peculiar velocities around the density peak smooth out 
the spectrum, and create a more gradual decline in the optical 
depth redward of Lya. Overdensities close to the line center 
become regions of relatively low optical depth, since the gas 
Doppler shifts out of resonance. Further away on the blue side 
from the host overdensity, the scatter in the density and veloc- 
ity fields among different LOSs decreases, and the velocity his- 
togram becomes centered around km/s (as mentioned above), 
so the difference between the averaged tr curves becomes sta- 
tistically negligible. 

These figures merely emphasize that density biasing due to 
the host environment is important, and it requires a larger sim- 
ulation with a statistical sample of density peaks to quantify. 
We plan to carry out such an analysis in detail in a future pa- 
per. In particular, the analysis presented in this paper should be 
performed for a larger number of density peaks in other sim- 
ulation boxes for the density bias to be statistically analyzed. 
In actual data analysis, it should help that the host environment 
should scale with the halo size, and therefore with the source's 
luminosity. 
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Fig. 7. — Spherically averaged radial velocity histograms at several dis- 
tances, corresponding to redshifts of z = Xobs/^a - 1, a«fay from the host pixel 
at Zj = 6. The histograms are constructed from different Aobs ranges: 8507.69A 
- 8509.69A (top left), 8504.69A - 8506.69A (top right), 8501.69A - 8503.69A 
(bottom left), and 8498.69A - 8500.69A (bottom right). The fraction of in- 
falling gas decreases with increasing distance from the central overdensity, 
until at about 10 A away from the line center, at which distance the histogram 
becomes nearly syrmnetric around km/s. Close to the central overdensity 
(top left), most of the gas has large negative radial velocities, but there is a 
significant, high-velocity tail in the distribution. 
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Fig. 8. — Values of T;; averaged over all LOSs: — ln(e ^s). The dotted curve 
shows TR calculated assuming the mean IGM density, nH(z) = rifj.o (1 +z)^, 
and not including velocity information. The dashed curve includes the density 
field from the simulation, but not the velocity field. The solid cun'e includes 
both the density and velocity information from the simulation box. The long 
dashed vertical line denotes the Lya line center at 8509.69 A. The solid curve 
can be approximately reproduced by convolving the dashed cmve with the 
velocity PDFs at each Aobs- Including velocity information smooths out the 
spectrum, and creates a more gradual dechne in the optical depth redward of 
Lya. Further away on the blue side from the host overdensity, the difference 
between the averaged tr curves becomes statistically neghgible. 

On the other hand, the biased region does not encompass the 
majority of the spectrum, as mentioned in §4.1. For our mock 
quasar, the biased region extends only ~ 5 A away from the 
line center. However, since our halo is smaller than that ex- 
pected to host typical z ~ 6 quasars, the biased region could be 
larger for a more realistic, more massive halo. Using a semi- 
analytical model, Barkana & Loeb (2004) recently derived the 
gas density and velocity distributions around the relevant halos. 
They find (see their Figure 1) that for halo masses typical of 
those expected to host the bright SDSS quasars, the biased re- 
gion should extend about ~ 1 Mpc (proper). This translates to 
a wavelength range of ~ 20 A and still influences less than 1/6 
of the region used in our analysis. Our initial results therefore 
suggest that one can extract histograms from spectra even 
without modeling the density bias, by ignoring this part of the 
spectrum around the fine center, as was done in this paper. 

It is interesting to ask which part of the optical depth distri- 
bution carries the most statistical power - the low, or the high- 
optical depth pixels, since the low-optical depth pixels are pref- 
erentially effected by biasing effects. We addressed this issue 
by computing the statistical power of fractions of our tr his- 
tograms. We found that similar constraints are obtained by us- 
ing only pixels with tr < 0.1 and by using only pixels with 
Tr > 0.1 (see Fig. 4). Approximately 1/3 of the underdense 
(tr < 0.1) pixels used in our analysis are expected to lie within 
the biased region discussed above. It would clearly be benefi- 
cial to have accurate statistics of the biasing of host halos, but 
this is impractical given current computational constraints. The 
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simplest way to deal with the issue is to remove the biased re- 
gion from the analysis (as was attempted in this paper). Since 
the majority of pixels are not affected, as discussed above, we 
expect our results to be fairly insensitive to biasing. 

As a final test of local bias, we created a new template dis- 
tribution of tr values by using LOSs generated from a new and 
independent simulation box. The densest region in the new box, 
from which LOSs originated, corresponds to a dark matter halo 
of mass Mhaio ~ 1 x lO'^M©, about 1/2 the mass of the densest 
halo in the simulation box used throughout the rest of this pa- 
per. The two template distributions can be seen in Fig. 9, with 
squares representing the new histogram, and crosses showing 
the old histogram (i.e. the top left panel of Figure 4). The two 
histograms are fairly similar (their difference is much smaller 
than those between the models being compared to one another 
in Figure 4). Our old template is somewhat wider, as would be 
expected from the larger density contrast we find in that simula- 
tion box. The density field around the new halo reaches within 
15% of the mean density ~ 5 A blueward from the line center, 
just as in the former halo. 
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Fig. 9. — Histogram of the template distribution of ra from LOSs originating 
at a dark matter halo of mass Mi^io « 2 x IO'^Mq (crosses), and originating 
at a dark matter halo of mass Mi^iio 1 x IO'^Mq (squares). The former 
distribution was used throughout this paper. The later distribution was created 
from a different simulation box, in order to investigate uncertainties in the 
template. The two distributions are similar, with the histogram corresponding 
to a larger mass host halo being somewhat wider due to the larger density 
contrast in the simulation box. 



6. ADDING ADDITIONAL CONSTRAINTS: LIMITS ON 

As discussed above, the dominant factor in this analysis is the 
degeneracy between small xh, small Rs, and large xh, large Rs 
parameter choices. Hence, it would be quite useful to be able 
to independently restrict allowed values of Rs- This is diffi- 
cult since the mean resonant optical depth grows roughly as the 
square of the distance away from the source, so resonance ab- 
sorption inside the Stromgren sphere can be sufficient to block 
out Lya flux alone, especiaUy considering uncertainties in the 



source's ionizing luminosity. 

However, a particularly important piece of information that 
we have not utilized in this paper is that the histograms could 
be constructed as a function of Aobs, instead of lumping them 
into a single histogram regardless of wavelength. In general, 
if a large number of spectra were available, this would contain 
more information, and would help characterize density biases, 
which should be a strong function of distance away from the 
source. 

More specifically, however, Mesinger & Haiman (2004) (here- 
after, MH04) recently presented a method, which takes only 
a crude account of the wavelength-dependence of the opac- 
ity, using essentially only the location of the Lya and Ly/3 GP 
troughs, to find a robust determination of Rs- Since the different 
hydrogen Lyman transitions have disparate oscillator strengths, 
simultaneously considering the measured absorption in two or 
more Lyman Unes can be an effective way to probe a sudden 
growth of the Lya optical depth near the boundary of the Strom- 
gren sphere. In MH04, we applied this technique to model the 
general behavior of the observed spectrum of SDSS J1030H-0524 
close to the onset of the Lya and Ly/3 GP troughs. We obtained 
a tight and robust constraint on Rs (better than 5 %), arising from 
the sharpness of growth of the Lya optical depth. If this (or 
any other) method can be used to independently constrain Rs to 
within ~ 1 0% , it would break the degeneracy between small xh , 
smaU Rs, and large xh, large Rs parameter choices. Specifically, 
if Rs is known to within ~ 10%, a neutral universe could be 
distinguished from axn < 0.008 universe with 99% confidence 
using an average of only 1 source, following the method pre- 
sented in this paper (note that an independent constraint on the 
neutral fraction can be derived from the size of the HII region, 
by utilizing the estimated ionizing flux of the source; Wyithe & 
Loeb 2004). The usefulness of knowing Rs can be seen from 
Figure 2, where all panels have the same Rs and there is a large 
difference between the flux decrements arising from a neutral 
(left panels) and an ionized (right panels) IGM. This is a highly 
encouraging result, and the we plan to apply both the method 
in MH04 and the method presented in this paper to the current 
sample of SDSS quasars. 

7. CONCLUSIONS 

Through an analysis of mock quasar absorption spectra based 
on a detailed cosmological hydrodynamic simulation, we have 
shown that it is possible to detect the effects of the damping 
wing of absorption by neutral hydrogen atoms in the IGM on 
top of the resonant absorption from within the local HII re- 
gion of the quasar. We have described an inversion method 
we have developed to extract an estimate of the mean neutral 
fraction of hydrogen in the IGM, and of the size of the Strom- 
gren sphere around a high-redshift source. The method is de- 
signed to differentiate between sources embedded in an IGM 
with 10"^ < Xh < 1, and we have found that it can distinguish 
among neutral fractions in this range with only a few bright 
quasars. 

We have expUcitly incorporated into our analysis an error in 
the intrinsic emission template, consisting of either an uncer- 
tainty in its spectral power-law index, or Gaussian, uncorre- 
lated, pixel-to-pixel variations at each wavelength. With both 
of these errors, we find that a neutral universe can be statisti- 
cally distinguished from a x// = 0.008 universe in our parameter 
space, using tens of bright quasars, a sample that can be ex- 
pected by the completion of the Sloan Digital Sky Survey. Al- 
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ternatively, similar statistical constraints can be derived from 
the spectra of several hundred sources that are ~ 100 times 
fainter. For example, the Large-aperture Synoptic Survey Tele- 
scope (LSST) should be able to deliver many new faint qua- 
sars that could serve as targets for low-resolution spectroscopy 
(Mesinger et al. 2004, in preparation). 

Furthermore, if the size of the source's Stromgren sphere can 
be independently constrained to within ~ 10% (such as with the 
method presented in MH04), the analysis presented here can 
distinguish between sources embedded in an IGM with 10"^ < 
xh <l, using a single source. We plan to perform such analysis 
on the current sample of high-redshift sources. 

The recent discovery of Gunn-Peterson troughs in the spectra 
of several z^6 quasars in the SDSS only impose the restriction 
of 10"^ < xh < 1 on the neutral fraction at this redshift. Further 
distinguishing between the values allowed in this range, espe- 
cially between a neutral and a mostly ionized universe, would 
provide invaluable new constraints that can differentiate among 
various competing reionization scenarios. 
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